ABSTRACT

The purpose of this study was to examine the
reliability and validity of a basal reading series mastery test.
Subjects were 21 fourth graders, who were tested once on the SRA
Reading Achievement Test, twice on the Holt Basic Reading Series
Management Program Level 13 Test (MPLT), and once on the Word Reading
Test. Traditional psychometric correlational analyses were applied to
the data to investigate the following dimensions of the technical
adequacy of the MPLT: test-retest reliability, criterion-related
validity with respect to two other measures of reading proficiency,
and convergent and discriminant validity. Results indicated
criterion-related validity of the MPLT was acceptable, but questioned
the test-retest reliability and the convergent and discriminant
validity. Implications for the development and the use of
criterion-referenced tests are discussed. (Author)
Abstract

The purpose of this study was to examine the reliability and
validity of a basal reading series mastery test. Subjects were 21
fourth graders, who were tested once on the SRA Reading Achievement
Test, twice on the Holt Basic Reading Series Management Program Level
13 Test (MPLT), and once on the Word Reading Test. Traditional
psychometric correlational analyses were applied to the data to
investigate the following dimensions of the technical adequacy of the
MPLT: (a) test-retest reliability, (b) criterion-related validity
with respect to two other measures of reading proficiency, and (c)
convergent and discriminant validity. Results indicated
criterion-related validity of the MPLT was acceptable, but questioned
the test-retest reliability and the convergent and discriminant
validity. Implications for the development and use of
criterion-referenced tests are discussed.
The Technical Adequacy of a Basal Reading Mastery Test:
The Holt Basic Reading Series

The development and use of criterion-referenced tests have
proliferated in the past two decades. Traditional norm-referenced
measurement has been criticized severely because it typically is
global and lacks content and face validity with respect to school
programs. As an alternative, criterion-referenced tests frequently
are isomorphic with respect to classroom curriculum.

Despite, or perhaps due to, such high content and face validity,
there has been scant empirical investigation of the psychometric
characteristics of criterion-referenced tests. Inspection of eight
commercial criterion-referenced tests and four basal reading mastery
tests (Tindal, Shinn, Fuchs, Fuchs, Deno, & Germann, 1983) revealed
that only one-third of test manuals addressed reliability and validity
at all, and authors of only two tests investigated more than one aspect
of psychometric adequacy.

Recent investigations of available criterion-referenced basal
reading mastery tests (Fuchs, Tindal, Shinn, Fuchs, Deno, & Germann,
1983; Tindal, Fuchs, Fuchs, Shinn, Deno, & Germann, 1983; Tindal,
Shinn, Fuchs, Fuchs, Deno, & Germann, 1983) document traditional
psychometric wisdom: Face and content validity are not synonymous
with technical adequacy. The reliability and validity of a mastery
test from the Houghton-Mifflin reading series were less than adequate
for the decoding and comprehension test scales (Tindal, Shinn, Fuchs,
Fuchs, Deno, & Germann, 1983). The adequacy of a Ginn 720 mastery
test was acceptable for the total test score, but variable for the
subtests (Fuchs et al., 1983), and the reliability and validity of a
Scott-Foresman mastery test was fairly high (Tindal, Fuchs, Fuchs,
Shinn, Deno, & Germann, 1983). Such findings underscore the necessity
for investigating psychometric properties of each criterion-referenced
test separately. Therefore, the purpose of the current study was to
examine the reliability and validity of another basal series mastery
test, one in the Holt Basic Reading Program Series.

Method 

Subjects 

Subjects were 21 students (8 M, 13 F) from one fourth grade class
representing a school district within a rural midwestern cooperative.
The students' mean reading percentile rank was 49.4 (SD = 24.1) as
measured on the Science Research Associates (SRA) Reading Achievement
Test.

Measures

Three measures of reading performance were used in the study: a 
basal series criterion-referenced test, a global norm-referenced test, 
and a curriculum-based word reading test. 

Criterion-referenced test. Four scales of the Management Program
Level Test (MPLT; Rosenbaum & O'Desky, 1980), Level 13 of the Holt
Basic Reading series, were employed as measures. Each of the four
scales, Comprehension/Literary Skills, Decoding/Encoding Skills,
Language Skills, and Study Skills, consists of subtests. Table 1
lists the subtests constituting each scale and provides brief
descriptions of the tasks the examinee is required to do within
subtests. The MPLT is criterion-referenced, with items per subtest
ranging from 4 to 20, items per scale ranging from 12 to 40, and
mastery-nonmastery cutoff scores on scales established at 67% to 74%
correct responses.
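
The cutoff rule can be illustrated with a short sketch. The function name, the example scale length, and the choice to treat a score exactly at the cutoff as mastery are assumptions made for illustration; the text gives only the range of cutoffs (67% to 74%), not the cutoff for any particular scale.

```python
def is_mastery(items_correct, items_total, cutoff_percent):
    """Return True when the percent correct meets or exceeds the cutoff.

    Integer arithmetic is used so that scores exactly at the cutoff are
    classified consistently. Treating "at the cutoff" as mastery is an
    assumption, not something the test manual is quoted as saying.
    """
    if items_total <= 0 or not 0 <= items_correct <= items_total:
        raise ValueError("items_correct must lie between 0 and items_total")
    return items_correct * 100 >= cutoff_percent * items_total


# Hypothetical 40-item scale with a 67% cutoff.
print(is_mastery(27, 40, 67))  # 27/40 = 67.5% correct
print(is_mastery(26, 40, 67))  # 26/40 = 65.0% correct
```

On a 12-item scale a single item is worth more than 8 percentage points, so a one-item change can move a student across the cutoff.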



Insert Table 1 about here 



Norm-referenced test. The Science Research Associates (SRA)
Reading Achievement Test (Naslund, Thorpe, & Lefever, 1978) consists
of two subtests: vocabulary and comprehension. In the
vocabulary section, examinees are required to select, from four
alternatives, a synonym for an underlined word in a sentence. In the
comprehension section, examinees read 200-300 word passages and answer
questions in a multiple choice format. Total test score is based on a
linear combination of the two subtests. Internal consistency
reliability was reported at .88 (Salvia & Ysseldyke, 1981).

Curriculum-based word reading test. The Word Reading Test (Deno,
Mirkin, & Chiang, 1982) requires children to read aloud passages and
isolated word lists and is scored in terms of average numbers of words
correct and incorrect over two alternate forms of the Isolated Word
Reading and Passage Reading scales. The 200-word passages are drawn
randomly from a student's grade appropriate basal reading book; the
150-word lists sample words randomly from the basals, with 60% of the
words drawn from the student's grade appropriate level and 40% sampled
equally from all previous levels. For both the passage and isolated
word reading scales, test-retest and alternate form reliabilities were
at least .90 (Fuchs, Deno, & Marston, in press; Fuchs, Wesson, Tindal,
Mirkin, & Deno, 1981).
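
The composition of the 150-word list can be worked through as simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the number of previous levels (three in the example) is an assumption, since the text does not state how many levels precede the fourth grade book.

```python
def word_list_quota(list_length, prior_levels, current_percent=60):
    """Split a word list between the current level and earlier levels.

    Returns (words from the current level, words per prior level,
    words left over when the remainder does not divide evenly).
    """
    if prior_levels < 1:
        raise ValueError("at least one prior level is required")
    current = list_length * current_percent // 100
    remaining = list_length - current
    per_prior, leftover = divmod(remaining, prior_levels)
    return current, per_prior, leftover


# 150-word list, 60% from the current level, three prior levels (assumed).
print(word_list_quota(150, 3))
```

With these inputs, 90 words come from the grade appropriate level and the remaining 60 are divided evenly, 20 per prior level.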
Procedure

All students were tested in groups by a school psychologist on
the SRA Reading Achievement Test, and by their classroom teacher on
the MPLT. The Word Reading Test was administered individually by
trained aides. Standardized administration procedures were adhered to
on all tests. Testing time ranged from 60 to 90 minutes for the SRA
Test, 60 to 90 minutes for the MPLT, and five to six minutes for the
Word Reading Test. Students were administered the measures
in the following order within a two-week period: the MPLT, the SRA
Reading Achievement Test, the Word Reading Test, and the MPLT again.

Data Analysis

Test-retest reliability was assessed by correlating scores from
the two administrations of the MPLT. Criterion validity was
determined by correlating MPLT scores with two criterion measures, the
SRA Reading Achievement Test and the Word Reading Test. Finally,
convergent and discriminant validity were explored by examining
correlations among MPLT scales, correlations among subtests within
each scale, and correlations between subtest scores and their
respective scale scores.
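
Each of these analyses rests on a correlation between two sets of scores from the same students. The sketch below assumes the Pearson product-moment coefficient, the usual choice in traditional psychometric work, although the text does not name the coefficient. The scores are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross_products = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_sq_x = sum(a * a for a in dev_x)
    sum_sq_y = sum(b * b for b in dev_y)
    return cross_products / sqrt(sum_sq_x * sum_sq_y)


# Hypothetical scale scores for five students on two administrations.
first_administration = [20, 25, 31, 28, 35]
second_administration = [22, 24, 30, 27, 36]
retest_r = pearson_r(first_administration, second_administration)
print(round(retest_r, 2))
```

The same function serves for criterion-related validity (MPLT scores against SRA or Word Reading Test scores) and for the scale and subtest intercorrelations; only the pair of score lists changes.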

Results 

Table 2 displays students' mean scores and standard
deviations on the subtest and total scores of the SRA Reading
Achievement Test, on the isolated word reading and passage reading
scales of the Word Reading Test, and on each subtest and scale as well
as the total of the MPLT.



Insert Table 2 about here



Test-retest Reliability

Test-retest reliability coefficients are displayed in Table 3.
They ranged from .20 for the Language Skills scale to .79 for the 
Comprehension/Literary Skills scale. For the total test, test-retest 
reliability was .77. 



Insert Table 3 about here 



Criterion-related Validity

Correlational analyses were conducted between the MPLT scales and
two criterion measures, the SRA Reading Achievement Test and the Word
Reading Test. Correlations between the MPLT scales and the SRA
subscale and total test scores are displayed in Table 4. They ranged
from .62 to .90 when SRA vocabulary subtest scores were involved; from
.71 to .90 when SRA comprehension subtest scores were employed; and
from .72 to .95 when SRA total score was used. The median correlation
for the MPLT Comprehension/Literary Skills scale was .82; for
Decoding/Encoding Skills, .71; for Language Skills, .71; and for Study
Skills, .81. For the total test score, the median correlation was
.90.



Insert Table 4 about here



Correlations between the MPLT scales and the Word Reading Test
scale scores are displayed in Table 5. They ranged from .55 to .75
when isolated word reading score was involved, and from .46 to .86
when passage reading score was employed. The median correlation for
the MPLT Comprehension/Literary Skills scale was .770; for the MPLT
Decoding/Encoding Skills scale, .695; for the MPLT Language Skills
scale, .505; and for the MPLT Study Skills scale, .575. The median
correlation for the Total Test Score was .805.
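
The medians reported for the Word Reading Test can be recomputed from the Table 5 entries. The following is a minimal check, with the correlations transcribed from Table 5; the variable names are illustrative. With only two criterion scores per scale, each median is simply the midpoint of the two correlations.

```python
from statistics import median

# Correlations transcribed from Table 5: (isolated words, passages).
word_reading_r = {
    "Comprehension/Literary Skills": (0.75, 0.79),
    "Decoding/Encoding Skills": (0.64, 0.75),
    "Language Skills": (0.55, 0.46),
    "Study Skills": (0.57, 0.58),
    "Total Test": (0.75, 0.86),
}

medians = {scale: median(rs) for scale, rs in word_reading_r.items()}

for scale, value in medians.items():
    print(f"{scale}: {value:.3f}")
```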



Insert Table 5 about here 
Convergent and Discriminant Validity

Correlations among the MPLT scales and between the scales and
total score are presented in Table 6; correlations among subtest
scores and between subtest and respective scale scores are displayed
for each of the four scales in Tables 7-10. Between the MPLT scales,
correlations ranged from .53 to .73. Scale scores correlated with the
total score between .77 and .94.



Insert Tables 6-10 about here 



Within the Comprehension/Literary Skills scale (see Table 7),
intersubtest correlations fell between .25 and .55. Subtests
correlated with the total scale score an average .72 (SD = .14). The
three Decoding/Encoding intersubtest correlations (see Table 8) were
-.59, -.28, and .69. The average correlation between the subtest and
scale scores was .54 (SD = .47). For the Language Skills scale (see
Table 9), intersubtest correlations ranged from .10 to .39, and the
average correlation between the subtest and scale scores was .69
(SD = .11). Intersubtest correlations for the Study Skills scale (see
Table 10) ranged between -.23 and .56; the average correlation between
the subtest and scale scores was .68 (SD = .18). To summarize this
information concerning the convergent and discriminant validity of the
MPLT, Table 11 displays ranges of correlations for each scale (a) with
other scales, (b) with its own subtests, and (c) among its subtests.
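
The averages and standard deviations of the subtest-scale correlations can be recomputed from the "Scale" columns of Tables 7-10. The sketch below assumes the sample standard deviation (n - 1 in the denominator), which is the form that matches the reported figures; the text does not say which form was used. The correlations are averaged directly, as in the text, without a Fisher z transformation.

```python
from statistics import mean, stdev

# Subtest-with-scale correlations transcribed from Tables 7-10.
subtest_scale_r = {
    "Comprehension/Literary Skills": [0.66, 0.65, 0.65, 0.94],
    "Decoding/Encoding Skills": [0.87, 0.74, 0.00],
    "Language Skills": [0.70, 0.79, 0.57],
    "Study Skills": [0.86, 0.53, 0.52, 0.82],
}

# statistics.stdev is the sample standard deviation (divides by n - 1).
summary = {
    scale: (mean(rs), stdev(rs)) for scale, rs in subtest_scale_r.items()
}

for scale, (avg, sd) in summary.items():
    print(f"{scale}: mean = {avg:.3f}, SD = {sd:.3f}")
```

The Language Skills entries from Table 9 give a standard deviation of about .11.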


Insert Table 11 about here 
Discussion

The purpose of the current study was to describe the reliability
and validity of a basal reading series criterion-referenced mastery
test. The study examined three aspects of the technical adequacy of
the Holt Basic Reading Series Management Program Level Test (Level
13): (a) test-retest reliability, (b) criterion-related validity with
respect to two other measures of reading proficiency, which have
demonstrated psychometric strength, and (c) convergent and
discriminant validity. Results suggested that the technical adequacy
of the Holt MPLT was variable, with many indices less than adequate.

Test-retest reliability coefficients indicated that, when the
MPLT was administered twice within a short time interval, student
performance was inconsistent. None of the correlations obtained for
the scales or for the total test fell within the acceptable range even
for making group decisions (Salvia & Ysseldyke, 1981).



Correlational analyses indicated that the criterion-related
validity of the MPLT with respect to the SRA Reading Achievement Test
was good, with 63% of correlations between the MPLT and the SRA
subtests falling above .70 and 38% above .80. Correlations for the
Comprehension/Literary Skills scale were consistently highest. With
the Word Reading Test, correlations between the MPLT and the Word
Reading Test scales were somewhat lower, with 38% falling above .70
and none above .80. Again, correlations for the
Comprehension/Literary Skills scale were consistently highest.
Analysis of Table 1 reveals that tasks on the Comprehension/Literary
Skills scale are most global, requiring examinees on three of four
subtests to read paragraphs and answer multiple choice questions (as
is done on the SRA Comprehension Scale), and on the fourth subtest to
provide synonyms for underlined words (as is done on the SRA
Vocabulary Scale). Therefore, it is not surprising that correlations
for this Comprehension/Literary scale were higher than for other MPLT
scales, for which test behaviors are more discrete and less similar to
tasks on either criterion measure of reading achievement. Results
suggest that performance on the MPLT, especially the
Comprehension/Literary Skills scale, predicts concurrent performance
on more global measures of reading proficiency moderately well.
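
The percentages cited in this paragraph can be checked against the scale-level entries of Tables 4 and 5. The sketch below assumes the percentages are based on the four MPLT scales crossed with the two subtests or scales of each criterion measure (eight correlations per measure, excluding total scores); that reading is an inference from the figures, and the function name is illustrative.

```python
# Four MPLT scales x two SRA subtests (vocabulary, then comprehension), Table 4.
sra_subtest_r = [0.90, 0.62, 0.69, 0.64, 0.82, 0.71, 0.71, 0.81]
# Four MPLT scales x two Word Reading Test scales (isolated, then passages), Table 5.
word_reading_r = [0.75, 0.64, 0.55, 0.57, 0.79, 0.75, 0.46, 0.58]


def share_above(values, threshold):
    """Proportion of correlations strictly greater than the threshold."""
    return sum(v > threshold for v in values) / len(values)


for label, rs in (("SRA", sra_subtest_r), ("Word Reading", word_reading_r)):
    print(label, share_above(rs, 0.70), share_above(rs, 0.80))
```

Five of eight SRA correlations (62.5%) exceed .70 and three of eight (37.5%) exceed .80; for the Word Reading Test, three of eight exceed .70 and none exceeds .80, which agrees with the rounded percentages in the text.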

The convergent and discriminant validity of the MPLT appeared to
be less adequate. Correlations between the different scales were
similar in range to those between scales and their own
subtests. Further, correlations among subtests within each scale were
comparatively low. These results suggest that the MPLT scales may not
measure separate, distinct variables. Of course, in interpreting
these findings, a note of caution is necessary: Correlations among
subtests and between subtests and scales may fall low relative to the
between-scale statistics due to the comparatively few items and
restricted range of subtests.

Additionally, analyses employed in the present investigation were
traditional correlational approaches to the study of psychometric
characteristics. Such traditional ways of assessing test adequacy
have been criticized as largely inappropriate for criterion-referenced
instruments (Popham & Husek, 1969). Nevertheless, findings of
previous studies, which employed both traditional and alternative,
criterion-referenced strategies for studying psychometric
characteristics (Fuchs et al., 1983; Tindal, Shinn, Fuchs, Fuchs,
Deno, & Germann, 1983; Tindal, Fuchs, Fuchs, Shinn, Deno, & Germann,
1983), indicated that results from the two strategies support each
other. This suggests that one can interpret the traditional
correlational findings of this study as meaningful. Of course,
criterion-referenced analyses of the technical adequacy of the MPLT
would provide useful, additional descriptive information.

Consequently, the current study suggests that the Holt MPLT
varied in quality. For predicting global reading proficiency, the
MPLT appeared useful. However, for making decisions about student
placement and progress within the curriculum, results were less
favorable. Test-retest reliability of the MPLT was unacceptably low,
and the convergent and discriminant validity suggested problems in
interpreting scale scores meaningfully. This indicates that (a)
educators should use the MPLT with caution for making decisions about
mastery in the curriculum; and (b) test developers at Holt might
consider reexamining the test. Additionally, this study adds to a
growing body of evidence (Fuchs et al., 1983; Tindal, Fuchs, Fuchs,
Shinn, Deno, & Germann, 1983; Tindal, Shinn, Fuchs, Fuchs, Deno, &
Germann, 1983) suggesting that, despite the high content and face
validity of criterion-referenced tests, their meaningfulness and
accuracy remain empirical questions. Test consumers must demand such
empirical validation before relying on criterion-referenced test data
for making instructional decisions.
Table 1

Examinees' Tasks on the Holt Basic Reading MPLT

Scale                            Examinees' Tasks

Comprehension/Literary Skills
  Subtests 1-3                   Read stories and answer multiple choice
                                 questions concerning sequence of events,
                                 setting, identifying roles, identifying plot,
                                 inferring theme, inference, identifying fact
                                 vs. opinion, recalling details, gleaning
                                 vocabulary via context clues, identifying
                                 main ideas, identifying realism vs. fantasy,
                                 and identifying similes vs. metaphors.

  Subtest 4                      Read a sentence with an underlined word. From
                                 an array of four choices, select a synonym
                                 for the underlined word.

Decoding/Encoding Skills
  Subtests 1-2                   Given a key word with an underlined sound,
                                 select from among four choices those words
                                 which contain the sound. (Included sounds
                                 are: [ae], [e], [i], [a], [a], [ir], [ar],
                                 [or].)

  Subtest 3                      Given a two-syllable key word, select the
                                 correct syllabic division from two choices.

Language Skills
  Subtest 1                      Given a key word, identify an antonymous
                                 prefix from an array of four choices.

  Subtest 2                      Given a declarative sentence, identify, from
                                 an array of four choices, the first word of
                                 the question form of the sentence.

  Subtest 3                      Given a compound sentence, select the pair of
                                 sentences that were combined to make the
                                 compound sentence from an array of three
                                 pairs.

Study Skills
  Subtest 1                      Given three words with a space preceding and
                                 following each word and given a fourth word,
                                 select the space where the fourth word fits
                                 alphabetically.

  Subtest 2                      Given a word and four pairs of dictionary
                                 guide words, select the guide words that
                                 would be found on the dictionary page
                                 containing the word.

  Subtest 3                      Answer multiple choice questions concerning
                                 locating words in a dictionary and dictionary
                                 structure.

  Subtest 4                      Answer multiple choice questions concerning
                                 references in encyclopedia volumes and facts
                                 about encyclopedias.
Table 2

Student Performance on Measures of Reading Achievement

Test                                     Mean      SD

SRA Reading Achievement Test (N = 20)
  Vocabulary                             26.1
  Comprehension                          29.1
  Total                                  55.1

Word Reading Test (N = 21)
  Isolated Word Reading                  62.1    21.5
  Passage Reading                       124.0    42.6

Holt Basic Reading MPLT (N = 19)
  Comprehension/Literary Skills          26.1     5.9
    Subtest 1                             5.4     1.6
    Subtest 2                             3.1     1.6
    Subtest 3                             2.3     1.2
    Subtest 4                            15.1     3.1
  Decoding/Encoding Skills               14.1     2.1
    Subtest 1                             6.2     1.2
    Subtest 2                             6.4     1.4
    Subtest 3                             1.9     1.1
  Language Skills                         7.9     2.1
    Subtest 1                             2.3     0.9
    Subtest 2                             2.3     1.2
    Subtest 3                             3.4     1.0
  Study Skills                           13.6     3.3
    Subtest 1                             2.6     1.2
    Subtest 2                             2.6     0.8
    Subtest 3                             2.8     1.0
    Subtest 4                             5.6     1.6
  Total Test                             62.1    11.7
Table 3

Test-retest Reliabilities for Holt Basic Reading Test (N = 18)

Scale                              Reliability

Comprehension/Literary Skills          .79
Decoding/Encoding Skills               .68
Language Skills                        .20
Study Skills                           .45
Total Test                             .77
Table 4

Correlations Between Holt Basic Reading MPLT and SRA Test Scores (N = 19)

                                             SRA
Holt Scale                        Vocabulary    Comprehension

Comprehension/Literary Skills        .90             .82
Decoding/Encoding Skills             .62             .71
Language Skills                      .69             .71
Study Skills                         .64             .81
Total Test                           .87             .90

SRA Total column (one entry illegible; row alignment uncertain):
.72, .75, .80, .95
Table 5

Correlations Between Holt Basic MPLT and Word Reading
Test Scores (N = 19)

                                      Word Reading Test
Holt Scales                       Isolated Words    Passages

Comprehension/Literary Skills          .75            .79
Decoding/Encoding Skills               .64            .75
Language Skills                        .55            .46
Study Skills                           .57            .58
Total Test                             .75            .86
Table 6

Relations Among Holt Basic Reading MPLT Scale and Total
Test Scores (N = 19)

                          Comprehension/   Decoding/
Holt Scales               Literary         Encoding    Language   Study   Total

Comprehension/Literary                       .68         .61       .73     .94
Decoding/Encoding                                        .53       .53     .77
Language                                                           .66     .77
Study                                                                      .86
Table 7

Relations Among Comprehension/Literary Skills Subtest and
Scale Scores (N = 19)

                       Subtests
Subtests       1       2       3       4      Scale

1                     .25     .25     .54      .66
2                             .36     .50      .65
3                                     .55      .65
4                                              .94
Table 8

Relations Among Decoding/Encoding Skills Subtest and
Scale Scores (N = 19)

                   Subtests
Subtests       1       2       3      Scale

1                     .69    -.28      .87
2                            -.59      .74
3                                      .00
Table 9

Relations Among Language Skills Subtest and Scale Scores (N = 19)

                   Subtests
Subtests       1       2       3      Scale

1                     .39     .14      .70
2                             .10      .79
3                                      .57

Table 10

Relations Among Study Skills Subtest and Scale
Scores (N = 19)

                       Subtests
Subtests       1       2       3       4      Scale

1                     .56     .42     .48      .86
2                            -.23     .34      .53
3                                     .23      .52
4                                              .82

Table 11

Ranges of Correlations for Each Scale With Scales,
With Its Subtests, and Among Its Subtests

                                    Ranges of Correlations
Scale                     With Scales   With Own Subtests   Among Subtests

Comprehension/Literary     .68 - .73       .65 - .94          .25 - .55
Decoding/Encoding          .53 - .68       .00 - .87         -.59 - .69
Language                   .53 - .66       .57 -